Correction of the significance level when attempting multiple transformations of an explanatory variable in generalized linear models
Abstract
BACKGROUND In statistical modeling, finding the most favorable coding for an explanatory quantitative variable involves many tests. This raises a multiple testing problem and requires a correction of the significance level.
METHODS For each coding, a test of the null hypothesis that the coefficient associated with the newly coded variable is zero is computed. The selected coding is the one associated with the largest test statistic (or, equivalently, the smallest p-value). In the context of the Generalized Linear Model, Liquet and Commenges (Stat Probability Lett, 71:33-38, 2005) proposed an asymptotic correction of the significance level. This procedure, based on the score test, has been developed for dichotomous and Box-Cox transformations. In this paper, we suggest the use of resampling methods to estimate the significance level for categorical transformations with more than two levels, which by definition involve more than one parameter in the model. The categorical transformation is a more flexible way to explore the unknown shape of the relationship between an explanatory variable and a dependent variable.
RESULTS In our simulations, the proposed methods performed well. The methods were illustrated using data from a study of the relationship between cholesterol and dementia.
CONCLUSION The algorithms were implemented in R, and the associated CPMCGLM R package is available on CRAN.
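The idea of correcting the smallest p-value over several codings by resampling can be sketched as follows. This is only an illustration under stated assumptions, not the CPMCGLM implementation: it assumes a Gaussian model with an intercept and no adjustment covariates, uses a permutation of the outcome as the resampling scheme, and all function names (`design_from_cutpoints`, `score_pvalue`, `corrected_min_pvalue`) are hypothetical. Each coding is defined by a list of cutpoints, so a single cutpoint gives a dichotomous coding and several cutpoints give a categorical coding with more than one parameter.

```python
import numpy as np
from scipy.stats import chi2


def design_from_cutpoints(x, cutpoints):
    """Dummy-code x into len(cutpoints) + 1 classes, dropping the lowest class."""
    classes = np.digitize(x, cutpoints)            # class index 0..len(cutpoints)
    levels = np.arange(1, len(cutpoints) + 1)      # non-reference classes
    return (classes[:, None] == levels[None, :]).astype(float)


def score_pvalue(y, design):
    """Score test that all coding coefficients are zero (Gaussian model, intercept only under H0).

    Assumes every class is non-empty, so the design has full column rank.
    """
    n = len(y)
    yc = y - y.mean()
    zc = design - design.mean(axis=0)
    coef, *_ = np.linalg.lstsq(zc, yc, rcond=None)
    explained = yc @ (zc @ coef)                   # sum of squares explained by the coding
    stat = n * explained / (yc @ yc)               # n * R^2, the score statistic
    return chi2.sf(stat, df=design.shape[1])


def corrected_min_pvalue(y, x, codings, n_perm=500, seed=0):
    """Return (smallest observed p-value, permutation-corrected p-value)."""
    rng = np.random.default_rng(seed)
    designs = [design_from_cutpoints(x, c) for c in codings]
    observed = min(score_pvalue(y, d) for d in designs)
    hits = 0
    for _ in range(n_perm):
        y_perm = rng.permutation(y)                # breaks any link between y and x
        if min(score_pvalue(y_perm, d) for d in designs) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Illustrative data only: simulated, not taken from the study.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = rng.normal(size=200)                           # no true effect of x on y
codings = [
    [np.quantile(x, 0.5)],                         # dichotomous at the median
    [np.quantile(x, 0.25)],                        # dichotomous at the first quartile
    list(np.quantile(x, [1 / 3, 2 / 3])),          # three-level categorical coding
]
naive_p, corrected_p = corrected_min_pvalue(y, x, codings, n_perm=200)
print(naive_p, corrected_p)
```

The corrected value is the proportion of permuted data sets whose best coding looks at least as significant as the best coding in the observed data, which is what accounts for having tried several codings. Permuting the outcome is only valid here because the sketch has no adjustment covariates; with covariates, a different resampling scheme would be needed.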